Template for test


In [ ]:
from pred import Predictor
from pred import sequence_vector
from pred import chemical_vector

Controlling for random negatives vs. no random negatives in imbalance techniques, using S, T, and Y phosphorylation.

N phosphorylation is also included; however, no benchmarks are available yet.

Training data is from Phospho.ELM and benchmarks are from dbPTM.

Note: SMOTEENN seems to perform best.
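
The imbalance techniques compared in this notebook rebalance the positive and negative classes before training. As a rough illustration of the simplest of them, random under-sampling, here is a standard-library sketch. It is not the code used inside `Predictor`; the function `random_under_sample` below, its arguments, and the toy sequence windows are all hypothetical and only show the general idea.

```python
import random
from collections import Counter


def random_under_sample(samples, labels, seed=0):
    """Randomly drop samples so every class matches the smallest class size.

    Hypothetical helper for illustration only; not part of the pred module.
    """
    rng = random.Random(seed)

    # Group the samples by their class label.
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)

    # Every class is cut down to the size of the rarest class.
    target = min(len(group) for group in by_class.values())

    out_samples, out_labels = [], []
    for label, group in by_class.items():
        for sample in rng.sample(group, target):
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Toy data (made up): 2 positive sites and 8 negative sites.
samples = ["AKA", "GKG"] + ["XKX%d" % n for n in range(8)]
labels = [1, 1] + [0] * 8

balanced_samples, balanced_labels = random_under_sample(samples, labels)
print(Counter(balanced_labels))
```

After resampling, both classes have the same number of samples. Methods such as SMOTEENN and ADASYN instead synthesize new minority samples, so they generally keep more of the original data than under-sampling does.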


In [ ]:
# Imbalance techniques to compare.
par = ["pass", "ADASYN", "SMOTEENN", "random_under_sample", "ncl", "near_miss"]
for i in par:
    # "y" runs use random_data=0 (no random negatives); "x" runs use random_data=1.
    for label, random_data in (("y", 0), ("x", 1)):
        print(label, i)
        p = Predictor()
        p.load_data(file="Data/Training/k_acetylation.csv")
        p.process_data(vector_function="sequence", amino_acid="K", imbalance_function=i, random_data=random_data)
        p.supervised_training("bagging")
        p.benchmark("Data/Benchmarks/acet.csv", "K")
        del p

In [ ]: